Extracting Comparative Sentences from Korean Text Documents Using Comparative Lexical Patterns and Machine Learning Techniques
نویسندگان
چکیده
This paper proposes how to automatically identify Korean comparative sentences from text documents. This paper first investigates many comparative sentences referring to previous studies and then defines a set of comparative keywords from them. A sentence which contains one or more elements of the keyword set is called a comparative-sentence candidate. Finally, we use machine learning techniques to eliminate non-comparative sentences from the candidates. As a result, we achieved significant performance, an F1-score of 88.54%, in our experiments using various web documents.
منابع مشابه
Finding relevant features for Korean comparative sentence extraction
In this paper, we study how to extract comparative sentences from Korean text documents. We decompose our task into three steps: 1) collecting comparative keywords; 2) extracting comparative-sentence candidates by keyword searching; 3) eliminating non-comparative sentences from these candidates using machine learning techniques. We perform various experiments to find relevant features. As a res...
متن کاملClassifying Korean comparative sentences for comparison analysis
Comparisons sort objects based on their superiority or inferiority and they may have major effects on a variety of evaluation processes. The Web facilitates qualitative and quantitative comparisons via online debates, discussion forums, product comparison sites, etc., and comparison analysis is becoming increasingly useful in many application areas. This study develops a method for classifying ...
متن کاملExtracting Comparative Entities and Predicates from Texts Using Comparative Type Classification
The automatic extraction of comparative information is an important text mining problem and an area of increasing interest. In this paper, we study how to build a Korean comparison mining system. Our work is composed of two consecutive tasks: 1) classifying comparative sentences into different types and 2) mining comparative entities and predicates. We perform various experiments to find releva...
متن کاملارائه مدلی برای استخراج اطلاعات از مستندات متنی، مبتنی بر متنکاوی در حوزه یادگیری الکترونیکی
As computer networks become the backbones of science and economy, enormous quantities documents become available. So, for extracting useful information from textual data, text mining techniques have been used. Text Mining has become an important research area that discoveries unknown information, facts or new hypotheses by automatically extracting information from different written documents. T...
متن کاملUsing Machine Learning Techniques for Subjectivity Analysis based on Lexical and Non-lexical Features
Machine learning techniques have been used to address various problems and classification of documents is one of the main applications of such techniques. Opinion mining has emerged as an active research domain due to its wide range of applications such as multi-document summarization, opinion mining of documents and users’ reviews analysis improving answers of opinion questions in forums. Exis...
متن کامل